[mono][infra] Disable failing apple mobile tests #92108
Tagging subscribers to this area: @directhex
Issue Details
Work in progress. This PR aims to disable failing tests on the CI.
/azp run runtime-ioslike,runtime-ioslikesimulator
Azure Pipelines successfully started running 2 pipeline(s).
/azp run runtime-ioslike,runtime-ioslikesimulator
Azure Pipelines successfully started running 2 pipeline(s).
/azp run runtime-ioslikesimulator
Azure Pipelines successfully started running 1 pipeline(s).
/azp run runtime-ioslikesimulator
Azure Pipelines successfully started running 1 pipeline(s).
This PR will also address: #92129
/azp run runtime-ioslikesimulator
Azure Pipelines successfully started running 1 pipeline(s).
Should we also consider #90460 to be covered here, since we are currently not able to reproduce the failure locally?
Azure Pipelines successfully started running 2 pipeline(s).
Yes, absolutely. The test is disabled and the tracking issue is added.
/azp run runtime-ioslikesimulator
Azure Pipelines successfully started running 1 pipeline(s).
/azp run runtime-ioslikesimulator
Azure Pipelines successfully started running 1 pipeline(s).
/azp run runtime-ioslikesimulator
Azure Pipelines successfully started running 1 pipeline(s).
/azp run runtime-ioslikesimulator
Azure Pipelines successfully started running 1 pipeline(s).
With more runtime tests enabled, the timeout for simulators went from 00:30:00 to 03:00:00.
src/libraries/sendtohelixhelp.proj
@@ -37,8 +37,9 @@
'$(Scenario)' == 'gcstress0xc_jitstress1' or
'$(Scenario)' == 'gcstress0xc_jitstress2' or
'$(Scenario)' == 'gcstress0xc_jitminopts_heapverify1'">06:00:00</_workItemTimeout>
<_workItemTimeout Condition="'$(_workItemTimeout)' == '' and ('$(TargetOS)' == 'iossimulator' or '$(TargetOS)' == 'tvossimulator' or '$(TargetOS)' == 'maccatalyst' or '$(TargetOS)' == 'android')">00:30:00</_workItemTimeout>
<_workItemTimeout Condition="'$(_workItemTimeout)' == '' and ('$(TargetOS)' == 'ios' or '$(TargetOS)' == 'tvos')">00:45:00</_workItemTimeout>
<_workItemTimeout Condition="'$(_workItemTimeout)' == '' and ('$(TargetOS)' == 'iossimulator' or '$(TargetOS)' == 'tvossimulator' or '$(TargetOS)' == 'maccatalyst')">03:00:00</_workItemTimeout>
I don't know how I feel about this change. This has the potential to back things up considerably should the work item go on and on.
It seems that the simulators are too slow. Since there is coverage on devices, we could reduce the testing scope on the simulators, for example by disabling the JIT and SIMD subsets.
3 hours seems like too much to me. Is it truly that slow?
Where does it spend most of its time: building or execution?
This is tvossimulator-x64 Release AllSubsets_Mono_RuntimeTests:
During the cancelled Send to Helix job, JIT_Intrinsics alone takes about 24 minutes, according to
https://helix.dot.net/api/jobs/5ff7ae64-ef29-4e8c-88dd-3c3a7481ed6c/workitems/JIT_Intrinsics?api-version=2019-06-17
@rolfbjarne do you also experience long execution times on ios-* simulators on xamarin's CI?
Usually just a few minutes, and it's not changed recently. Then again, we don't have that many tests either.
I find it weird that the simulator is so much slower than device though for you (assuming you run the same set of tests for both), in our experience simulator has always been the faster of the two.
I think the test execution on simulators is not identical to that on devices. For example, for tracing/eventpipe/buffersize the test duration is almost the same. However, the simulator log shows a cleanup step after each test that kills the simulator, which may stall the execution:
info: Application has finished with exit code: 100 (as expected)
info: Cleaning up simulator 'iPhone X (iOS 15.0) - created by XHarness'
dbug:
dbug: Running launchctl remove com.apple.CoreSimulator.CoreSimulatorService
dbug: Process simctl exited with 137
dbug: Process launchctl exited with 0
dbug:
dbug: Running killall -9 "iPhone Simulator" "iOS Simulator" Simulator "Simulator (Watch)" com.apple.CoreSimulator.CoreSimulatorService ibtoold
dbug: Process killall exited with 0
info: Simulators cleaned up
dbug: Saving diagnostics data to '/tmp/helix/working/BA6B0A0B/w/9E9B08FA/e/diagnostics.json'
XHarness exit code: 0
According to the xharness source, this is expected:
https://github.com/dotnet/xharness/blob/a3a749a7056623c665bba226fe843152f413f044/src/Microsoft.DotNet.XHarness.Apple/Orchestration/BaseOrchestrator.cs#L476-L496
I measured the interval between two tracing/eventpipe tests. Assuming both test executions take approximately the same time, the interval is about 22s on a device and about 64s on a simulator, which is almost 3x slower.
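To make the comparison concrete, here is a minimal sketch of the arithmetic behind the "almost 3x" figure. It uses only the two intervals quoted above (22s and 64s). The function names and the test counts in the loop are hypothetical, and it assumes the whole gap between the two intervals is a fixed per-test overhead, which the measurements above suggest but do not prove.

```python
# Intervals between two consecutive tracing/eventpipe tests, as quoted above.
DEVICE_INTERVAL_S = 22
SIMULATOR_INTERVAL_S = 64


def slowdown(simulator_s, device_s):
    """Return how many times longer the simulator interval is."""
    return simulator_s / device_s


def extra_minutes(test_count, simulator_s, device_s):
    """Extra wall-clock minutes on the simulator for test_count tests.

    ASSUMPTION: the whole gap between the two intervals is a fixed
    per-test overhead (for example, the simulator reset).
    """
    return test_count * (simulator_s - device_s) / 60


if __name__ == "__main__":
    ratio = slowdown(SIMULATOR_INTERVAL_S, DEVICE_INTERVAL_S)
    print(f"slowdown: {ratio:.1f}x")
    # HYPOTHETICAL test counts, chosen only to show how the overhead scales.
    for count in (50, 100, 500):
        extra = extra_minutes(count, SIMULATOR_INTERVAL_S, DEVICE_INTERVAL_S)
        print(f"{count} tests: about {extra:.0f} extra minutes")
```

Under that assumption the overhead grows linearly with the number of test apps, which would be consistent with the work item no longer fitting into a 30-minute timeout once more runtime tests were enabled.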
At the start of the test execution, xharness tries to shut down the simulator:
info: Looking for available ios-simulator-64 simulators..
dbug: Looking for available ios-simulator-64 simulators. Storing logs into list-ios-simulator-64-20230920_033625.log
info: Found simulator device 'iPhone X (iOS 15.0) - created by XHarness'
info: Getting app bundle information from '/tmp/helix/working/BA6B0A0B/w/9E9B08FA/e/tracing_eventpipe.app'..
dbug:
dbug: Running /usr/libexec/PlistBuddy -c "Print CFBundleName" /tmp/helix/working/BA6B0A0B/w/9E9B08FA/e/tracing_eventpipe.app/Info.plist
dbug: Process PlistBuddy exited with 0
dbug:
dbug: Running /usr/libexec/PlistBuddy -c "Print CFBundleIdentifier" /tmp/helix/working/BA6B0A0B/w/9E9B08FA/e/tracing_eventpipe.app/Info.plist
dbug: Process PlistBuddy exited with 0
dbug:
dbug: Running /usr/libexec/PlistBuddy -c "Print UIRequiredDeviceCapabilities" /tmp/helix/working/BA6B0A0B/w/9E9B08FA/e/tracing_eventpipe.app/Info.plist
dbug: Process PlistBuddy exited with 1
dbug: Property UIRequiredDeviceCapabilities not present in Info.plist, assuming 32-bit is not supported
dbug:
dbug: Running /usr/libexec/PlistBuddy -c "Print CFBundleExecutable" /tmp/helix/working/BA6B0A0B/w/9E9B08FA/e/tracing_eventpipe.app/Info.plist
dbug: Process PlistBuddy exited with 0
info: Reseting simulator 'iPhone X (iOS 15.0) - created by XHarness'
dbug:
dbug: Running launchctl remove com.apple.CoreSimulator.CoreSimulatorService
dbug: Process launchctl exited with 0
dbug:
dbug: Running killall -9 "iPhone Simulator" "iOS Simulator" Simulator "Simulator (Watch)" com.apple.CoreSimulator.CoreSimulatorService ibtoold
dbug: No matching processes belonging to you were found
dbug: Process killall exited with 1
dbug:
dbug: Running /Applications/Xcode131.app/Contents/Developer/usr/bin/simctl shutdown A912189F-5CF6-4921-B4EA-DDCD1ED23F10
dbug: An error was encountered processing the command (domain=com.apple.CoreSimulator.SimError, code=405):
dbug: Unable to shutdown device in current state: Shutdown
dbug: Process simctl exited with 149
dbug:
After that, it restarts the simulator:
dbug: Running /Applications/Xcode131.app/Contents/Developer/usr/bin/simctl shutdown A912189F-5CF6-4921-B4EA-DDCD1ED23F10
dbug: An error was encountered processing the command (domain=com.apple.CoreSimulator.SimError, code=405):
dbug: Unable to shutdown device in current state: Shutdown
dbug: Process simctl exited with 149
info: Simulator reset finished
There is also a command at the start of the test execution that looks like it creates a new simulator:
dbug: Xamarin.Hosting: Booting iPhone X (iOS 15.0) - created by XHarness...
dbug: Xamarin.Hosting: Booted iPhone X (iOS 15.0) - created by XHarness successfully.
Simulator runs become unstable after running for a while. The shutdown / restart is meant to protect against that.
It seems that xcrun simctl shutdown and xcrun simctl boot are time-consuming. Without the --reset-simulator parameter, the tests take the same amount of time as they do on devices.
I suggest disabling it for the runtime tests and monitoring the CI.
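One way to check the claim about xcrun simctl shutdown and xcrun simctl boot locally is to time the two commands directly. The following is a rough sketch, not what xharness does internally: the helper names and the SIM_UDID environment variable are hypothetical, and it measures only how long each command takes to return, which may not cover the full time until the simulator is usable.

```python
import os
import shutil
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd and return (elapsed_seconds, exit_code)."""
    start = time.perf_counter()
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return time.perf_counter() - start, result.returncode


def time_reset_cycle(udid):
    """Time a shutdown followed by a boot of the simulator with this UDID."""
    steps = {
        "shutdown": ["xcrun", "simctl", "shutdown", udid],
        "boot": ["xcrun", "simctl", "boot", udid],
    }
    return {name: time_command(cmd) for name, cmd in steps.items()}


if __name__ == "__main__":
    # SIM_UDID is a hypothetical variable name used only by this sketch.
    udid = os.environ.get("SIM_UDID")
    if udid and shutil.which("xcrun"):
        for name, (elapsed, code) in time_reset_cycle(udid).items():
            print(f"{name}: {elapsed:.1f}s (exit code {code})")
    else:
        print("Set SIM_UDID and run on macOS with Xcode installed.", file=sys.stderr)
```

A non-zero exit code from the shutdown step is not necessarily a failure: as the log above shows, simctl reports an error when the device is already in the Shutdown state.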
/azp run runtime-ioslikesimulator
Azure Pipelines successfully started running 1 pipeline(s).
@ivanpovazan @steveisok Please take a look again.
@@ -3699,6 +3699,12 @@
<ExcludeList Include = "$(XunitTestBinBase)/baseservices/exceptions/unhandled/**">
<Issue>System.Diagnostics.Process is not supported</Issue>
</ExcludeList>
<ExcludeList Include = "$(XunitTestBinBase)/JIT/SIMD/Vector3Interop_r/**">
<Issue>https://github.com/dotnet/runtime/issues/92129</Issue>
Once this PR lands we should remove blocking CI labels.
@@ -5,6 +5,8 @@
<AllowUnsafeBlocks>true</AllowUnsafeBlocks>
<InvariantGlobalization>true</InvariantGlobalization>
<CLRTestTargetUnsupported Condition="'$(IlcMultiModule)' == 'true'">true</CLRTestTargetUnsupported>
<!-- Tracking issue: https://github.com/dotnet/runtime/issues/90460 -->
Same as above.
LGTM
Description
This PR aims to disable failing tests on the CI. Tracking issues are added for disabled tests.